On Initialization of the Expectation-Maximization Clustering Algorithm

Authors

  • Z. Volkovich
  • M. Golani
Abstract

Iterative clustering algorithms commonly do not lead to optimal cluster solutions. Partitions generated by these algorithms are known to be sensitive to the initial partitions fed in as an input parameter. A "good" selection of initial partitions is therefore an essential clustering problem. In this paper we introduce a new method for constructing the set of initial partitions to be used by the Expectation-Maximization (EM) clustering algorithm. Our approach follows ideas from the Cross-Entropy method: we use a sample clustering produced by the EM algorithm itself as an alternative to the simulation phase of the Cross-Entropy method. Experimental results demonstrate the good performance of the proposed method.
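The paper's Cross-Entropy-based construction is not reproduced here, but the sensitivity to initialization that motivates it can be illustrated with a minimal multi-start EM sketch for a two-component one-dimensional Gaussian mixture. The data, the number of restarts, and the search range for the initial means are all illustrative assumptions, not the authors' method:

```python
import math
import random

def em_1d(data, mu1, mu2, iters=50):
    """EM for a two-component 1D Gaussian mixture, started from the
    initial means (mu1, mu2); returns the final means and log-likelihood."""
    pi, s1, s2 = 0.5, 1.0, 1.0  # mixing weight and standard deviations
    for _ in range(iters):
        # E-step: posterior responsibility of component 1 for each point
        resp = []
        for x in data:
            p1 = pi * math.exp(-0.5 * ((x - mu1) / s1) ** 2) / s1
            p2 = (1.0 - pi) * math.exp(-0.5 * ((x - mu2) / s2) ** 2) / s2
            resp.append(p1 / (p1 + p2) if p1 + p2 > 0 else 0.5)
        # M-step: re-estimate the parameters from the responsibilities
        n1 = max(sum(resp), 1e-9)
        n2 = max(len(data) - n1, 1e-9)
        mu1 = sum(r * x for r, x in zip(resp, data)) / n1
        mu2 = sum((1 - r) * x for r, x in zip(resp, data)) / n2
        s1 = max(1e-3, math.sqrt(sum(r * (x - mu1) ** 2
                                     for r, x in zip(resp, data)) / n1))
        s2 = max(1e-3, math.sqrt(sum((1 - r) * (x - mu2) ** 2
                                     for r, x in zip(resp, data)) / n2))
        pi = n1 / len(data)

    def pdf(x):
        c = math.sqrt(2 * math.pi)
        return (pi * math.exp(-0.5 * ((x - mu1) / s1) ** 2) / (s1 * c)
                + (1 - pi) * math.exp(-0.5 * ((x - mu2) / s2) ** 2) / (s2 * c))

    ll = sum(math.log(max(pdf(x), 1e-300)) for x in data)
    return (mu1, mu2), ll

random.seed(0)
# Synthetic sample: two well-separated clusters (true means 0 and 6)
data = ([random.gauss(0.0, 1.0) for _ in range(100)]
        + [random.gauss(6.0, 1.0) for _ in range(100)])

# Multi-start: run EM from several random initial means and keep the
# solution with the highest log-likelihood
runs = [em_1d(data, random.uniform(-2.0, 8.0), random.uniform(-2.0, 8.0))
        for _ in range(10)]
best_means, best_ll = max(runs, key=lambda r: r[1])
print(sorted(best_means))  # means should land near the true centers 0 and 6
```

Plain multi-start as shown simply hopes one random start lands well; the method proposed in the paper instead constructs promising initial partitions directly, which is what the Cross-Entropy-style scheme is for.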

Similar articles

A survey of model-based clustering algorithms for sequential data

Clustering is a fundamental and widely applied method for understanding and exploring a data set. Interest in clustering has increased recently due to the emergence of several new areas of application, including data mining, bioinformatics, web usage data analysis, and image analysis. Model-based clustering is one of the most important and widely used clustering methods. This paper presents ...

Incremental Mixture Learning for Clustering Discrete Data

This paper elaborates on an efficient approach for clustering discrete data by incrementally building multinomial mixture models through likelihood maximization using the Expectation-Maximization (EM) algorithm. The method adds sequentially at each step a new multinomial component to a mixture model based on a combined scheme of global and local search in order to deal with the initialization p...

Unsupervised Learning of Finite Gaussian Mixture Models (GMMs): A Greedy Approach

In this work we propose a clustering algorithm that learns a finite Gaussian mixture model on-line from multivariate data based on the expectation-maximization approach. Convergence to the right number of components, as well as to their means and covariances, is achieved without requiring any careful initialization. Our methodology starts from a single mixture component covering the whole data s...

Unsupervised learning of regression mixture models with unknown number of components

Regression mixture models are widely studied in statistics, machine learning and data analysis. Fitting regression mixtures is challenging and is usually performed by maximum likelihood using the expectation-maximization (EM) algorithm. However, it is well known that the initialization is crucial for EM. If the initialization is inappropriately performed, the EM algorithm may lead to unsatis...

An Experimental Comparison of Several Clustering and Initialization Methods

We examine methods for clustering in high dimensions. In the first part of the paper, we perform an experimental comparison between three batch clustering algorithms: the Expectation–Maximization (EM) algorithm, a “winner take all” version of the EM algorithm reminiscent of the K-means algorithm, and model-based hierarchical agglomerative clustering. We learn naive-Bayes models with a hidden ro...
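The contrast drawn above between the EM algorithm and its "winner take all" variant reminiscent of K-means comes down to how the E-step assigns points. A minimal sketch for one point and two hypothetical unit-variance cluster centers (all values illustrative):

```python
import math

centers = [0.0, 5.0]  # two hypothetical cluster centers, unit variance
x = 1.2               # a point to assign

# Soft EM E-step: posterior responsibilities under equal mixing weights
weights = [math.exp(-0.5 * (x - c) ** 2) for c in centers]
soft = [w / sum(weights) for w in weights]

# "Winner take all" step: the nearest center receives the whole point,
# which is exactly the K-means assignment rule
hard = [0.0] * len(centers)
hard[max(range(len(centers)), key=lambda k: weights[k])] = 1.0

print(soft, hard)
```

With well-separated centers the soft responsibilities are already close to 0/1, which is why the two variants often behave similarly in practice.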


Journal:

Volume   Issue 

Pages  -

Publication date: 2011